fix: Preserve integer values in round() for large Int64 and UInt64 inputs#22697
fix: Preserve integer values in round() for large Int64 and UInt64 inputs#22697pchintar wants to merge 1 commit into
Conversation
e565975 to
7832c0c
Compare
| // extra precision to accommodate potential carry-over. | ||
| let return_type = | ||
| match input_type { | ||
| input_type if is_integer_data_type(input_type) => input_type.clone(), |
There was a problem hiding this comment.
returns the integer type for all scales, but only non-negative scales are handled downstream. round(arrow_cast(125,'Int64'), -1) now fails, it returned 130.0 before this PR.
|
|
||
| match (value_scalar, args.return_type()) { | ||
| (value_scalar, return_type) | ||
| if is_integer_data_type(return_type) && dp >= 0 => |
There was a problem hiding this comment.
only covers dp >= 0. For dp < 0 with an integer value, needs handler for negative scale
| } | ||
|
|
||
| let arr: ArrayRef = match (value_array.data_type(), return_type) { | ||
| (input_type, return_type) |
There was a problem hiding this comment.
The no-op fast path requires decimal_places to be a non-negative scalar literal. When the scale is a column (or a negative literal), this guard is false and there's no other integer arm, so it hits exec_err!
CREATE TABLE t(v BIGINT, dp INT) AS VALUES (125,1),(125,-1);
SELECT round(v, dp) FROM t; -- errored; worked (Float64) before this PRThere was a problem hiding this comment.
tests only cover non-negative scalar scale. There's no coverage for negative scale, column scale, or negative scale on a column
| (value_scalar, Float64) | ||
| if is_integer_data_type(&value_scalar.data_type()) => | ||
| { | ||
| let value = ColumnarValue::Scalar(value_scalar.clone()) | ||
| .cast_to(&Float64, None)?; | ||
| match value { | ||
| ColumnarValue::Scalar(ScalarValue::Float64(Some(v))) => { | ||
| let rounded = round_float(v, dp)?; | ||
| Ok(ColumnarValue::Scalar(ScalarValue::Float64(Some(rounded)))) | ||
| } | ||
| ColumnarValue::Scalar(ScalarValue::Float64(None)) => { | ||
| Ok(ColumnarValue::Scalar(ScalarValue::Float64(None))) | ||
| } | ||
| _ => internal_err!( | ||
| "Unexpected datatype after casting integer argument to Float64" | ||
| ), | ||
| } | ||
| } |
There was a problem hiding this comment.
This (value_scalar, Float64) if is_integer_data_type(value) arm and theround_integer_array_to_float64! macro + the eight (IntN, Float64) array arms are unreachable:
|
cc @neilconway |
7832c0c to
97a7294
Compare
|
@kumarUjjawal Thnx for the review! I've updated the implementation so that integer inputs now stay as integer types instead of being converted through I also added tests for the cases you mentioned:
Additionally, I updated the affected sqllogictest expectations since the plan output changed after removing the unnecessary cast to I've also rerun the relevant tests and manually verified all these cases. |
|
@kumarUjjawal could you pls re-run the CI checks? thnx again |
|
@kumarUjjawal This failure seems to be unrelated to my code. It failed before running DataFusion tests because GitHub Actions could not pull the Docker image: So this is a CI infrastructure/network failure, not a test failure perhaps? |
Yeah ignore this, it's Github issue |
So wdyt of my PR? is it good enough to be approved? or is it too early to tell? |
Will review tomorrow! |
Sure, thank you! |
Which issue does this PR close?
Rationale for this change
round()should not change integer values when the scale is non-negative, since no fractional digits need to be rounded.Currently, core
round()coerces largeInt64values throughFloat64, causing precision loss:Before/Current Buggy Output:
Expected:
The Spark-compatible
round()also fails forUInt64values abovei64::MAXeven when the scale is non-negative:Before/Current Buggy Output:
What changes are included in this PR?
Preserved integer inputs in core
round()for non-negative scales instead of routing them throughFloat64.Preserved
UInt64values in Spark-compatibleround()when the scale is non-negative, avoiding unnecessaryUInt64 -> i64conversion.Added SQLLogicTest coverage for:
Int64values above2^53in coreround().UInt64::MAXin Spark-compatibleround().Are these changes tested?
Yes.
I also verified the core regression queries manually:
Result:
Result:
Are there any user-facing changes?
No.